Introduction

This report explores the relationship between access to electricity and adjusted net national income per capita across different countries, while considering the impact of population and country size. The analysis includes exploratory data visualizations and a linear regression model to uncover trends and relationships.

Data Preparation

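# Load the packages used throughout the report
library(dplyr)
library(ggplot2)
library(leaflet)
library(rnaturalearth)
library(broom)
library(knitr)

# Load the dataset
data <- read.csv("data/merged_data_elect.csv")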

# Check for missing values to assess data completeness
summary(data)
##  Country_Name       Country_Code            Year      Access_to_Electricity
##  Length:550         Length:550         Min.   :2000   Min.   :  3.10       
##  Class :character   Class :character   1st Qu.:2005   1st Qu.: 60.65       
##  Mode  :character   Mode  :character   Median :2010   Median : 96.15       
##                                        Mean   :2010   Mean   : 78.22       
##                                        3rd Qu.:2016   3rd Qu.:100.00       
##                                        Max.   :2021   Max.   :100.00       
##                                                                            
##  Adjusted_Net_National_Income_Per_Capita   Population        Population_density
##  Min.   :  141.2                         Min.   :8.910e+04   Min.   :   5.504  
##  1st Qu.:  788.6                         1st Qu.:1.145e+07   1st Qu.:  17.836  
##  Median : 2306.4                         Median :3.277e+07   Median :  74.391  
##  Mean   :10414.1                         Mean   :1.676e+08   Mean   : 164.680  
##  3rd Qu.:14330.6                         3rd Qu.:1.467e+08   3rd Qu.: 217.487  
##  Max.   :64743.9                         Max.   :1.412e+09   Max.   :1301.039  
##  NA's   :12                                                                    
##   Surface_area      CO2_Emissions     
##  Min.   :     180   Min.   :   0.246  
##  1st Qu.:  243610   1st Qu.:  30.529  
##  Median :  796100   Median : 118.870  
##  Mean   : 2461119   Mean   : 242.174  
##  3rd Qu.: 1285220   3rd Qu.: 361.936  
##  Max.   :17098250   Max.   :4143.910  
##                     NA's   :230
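
The NA's rows in the summary can also be tallied directly. The sketch below uses a small made-up data frame (the `toy` object and all of its values are illustrative, not taken from the dataset) to show one way to count missing values per column and to keep only the rows where income is observed.

```r
# Toy data frame standing in for the real dataset (all values are made up)
toy <- data.frame(
  Country_Code = c("AAA", "AAA", "BBB", "BBB"),
  Access_to_Electricity = c(55.0, 60.2, 100, 100),
  Adjusted_Net_National_Income_Per_Capita = c(800, NA, 30000, 31000),
  CO2_Emissions = c(NA, NA, 350.1, NA)
)

# Number of missing values in each column
na_counts <- colSums(is.na(toy))
print(na_counts)

# Share of rows that are missing, per column
na_share <- colMeans(is.na(toy))
print(na_share)

# Rows usable for an income analysis (income is observed)
toy_income <- toy[!is.na(toy$Adjusted_Net_National_Income_Per_Capita), ]
print(nrow(toy_income))
```

Applied to the real data, `colSums(is.na(data))` should agree with the summary above: 12 missing values for income and 230 for CO2 emissions.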

Exploratory Data Analysis

Scatter Plot: Electricity vs Income

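This scatter plot shows access to electricity (y-axis) against adjusted net national income per capita (x-axis). Each point is one country-year observation, colored by country. A straight line fitted by ordinary least squares (method = "lm") shows the overall linear trend. The 12 observations with missing income are dropped, as the warnings after the code report.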

ggplot(data, aes(x = Adjusted_Net_National_Income_Per_Capita, y = Access_to_Electricity)) +
  geom_point(aes(color = Country_Name)) +
  geom_smooth(method = "lm", se = FALSE, color = "black") +
  labs(title = "Access to Electricity vs Income", x = "Income per Capita (USD)", y = "Access to Electricity (%)", color = "Country") +
  theme_minimal()
## `geom_smooth()` using formula = 'y ~ x'
## Warning: Removed 12 rows containing non-finite outside the scale range
## (`stat_smooth()`).
## Warning: Removed 12 rows containing missing values or values outside the scale range
## (`geom_point()`).

Population Density and Electricity Access

This box plot shows how access to electricity is distributed within each population density category. Country-year observations are grouped into five categories by population density (people per sq. km).

data <- data %>%
  mutate(Population_Density_Category = cut(Population_density, breaks = c(0, 50, 100, 200, 500, Inf), 
                                           labels = c("0-50", "51-100", "101-200", "201-500", ">500")))

ggplot(data, aes(x = Population_Density_Category, y = Access_to_Electricity)) +
  geom_boxplot() +
  labs(title = "Access to Electricity by Population Density", x = "Population Density (people per sq. km)", y = "Access to Electricity (%)") +
  theme_minimal()
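
To make the binning concrete, the sketch below applies the same breaks and labels to a few illustrative density values (the `density` vector is made up for this example). It shows that `cut()` uses right-closed intervals by default, so a value of exactly 50 lands in "0-50" and a value just above 50 lands in "51-100".

```r
# Same breaks and labels as in the report's cut() call
breaks <- c(0, 50, 100, 200, 500, Inf)
labels <- c("0-50", "51-100", "101-200", "201-500", ">500")

# Illustrative density values chosen to sit on and around the bin edges
density <- c(5.5, 50, 50.4, 100, 1301)

# cut() uses right-closed intervals by default: (0,50], (50,100], ...
bins <- cut(density, breaks = breaks, labels = labels)
print(data.frame(density, bins))

# Observations per bin (empty bins are kept as factor levels)
print(table(bins))
```

Because the labels are written as integer ranges, a density such as 50.4 is reported under "51-100" even though it is below 51; the labels are shorthand for the intervals, not exact bounds.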

Geographical Maps

These interactive maps shade each country by its access to electricity and by its adjusted net national income per capita. Hovering over a country shows its name and value.

Access to Electricity

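# Keep each country's most recent year (an assumption) so every polygon is drawn once
latest <- data %>% group_by(Country_Code) %>% filter(Year == max(Year)) %>% ungroup()

world <- ne_countries(scale = "medium", returnclass = "sf")
world_data <- world %>% 
  left_join(latest, by = c("iso_a3" = "Country_Code"))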

# Build the palette once and reuse it for the polygons and the legend
pal_electricity <- colorNumeric("YlGn", domain = world_data$Access_to_Electricity)

leaflet(world_data) %>% 
  addTiles() %>% 
  addPolygons(fillColor = ~pal_electricity(Access_to_Electricity),
              color = "black", weight = 1, opacity = 1, fillOpacity = 0.7,
              label = ~paste(admin, "Electricity:", Access_to_Electricity, "%")) %>% 
  addLegend(pal = pal_electricity, values = world_data$Access_to_Electricity, 
            title = "Access to Electricity (%)")

Adjusted Net National Income per Capita

# Build the palette once and reuse it for the polygons and the legend
pal_income <- colorNumeric("Blues", domain = world_data$Adjusted_Net_National_Income_Per_Capita)

leaflet(world_data) %>% 
  addTiles() %>% 
  addPolygons(fillColor = ~pal_income(Adjusted_Net_National_Income_Per_Capita),
              color = "black", weight = 1, opacity = 1, fillOpacity = 0.7,
              label = ~paste(admin, "Income Per Capita:", Adjusted_Net_National_Income_Per_Capita, "USD")) %>% 
  addLegend(pal = pal_income, 
            values = world_data$Adjusted_Net_National_Income_Per_Capita, 
            title = "Income Per Capita (USD)")
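
Countries whose codes do not match between the dataset and the map layer are drawn without data, so it can help to list the unmatched codes before mapping. The sketch below uses two made-up code vectors (`data_codes` and `map_codes` are hypothetical). The "-99" entries reflect an assumption that some Natural Earth releases carry a placeholder in `iso_a3` for a few countries, which is worth verifying against the installed version.

```r
# Hypothetical ISO codes: one set from the dataset, one from the map layer
data_codes <- c("FRA", "IND", "KEN", "NOR")
map_codes  <- c("IND", "KEN", "-99", "-99", "USA")

# Codes in the dataset with no matching polygon (these would be mapped as missing)
unmatched <- setdiff(data_codes, map_codes)
print(unmatched)

# Codes present in both sources
matched <- intersect(data_codes, map_codes)
print(matched)
```

On the report's objects, the equivalent check would be `setdiff(unique(data$Country_Code), world$iso_a3)`; an empty result means every country in the dataset has a polygon.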

Modeling

Linear Regression Model

A linear regression model estimates how access to electricity is associated with adjusted net national income per capita, population, and surface area. By default, lm() drops the 12 observations with missing income. Results are displayed in tabular format for clarity.

lm_model <- lm(Access_to_Electricity ~ Adjusted_Net_National_Income_Per_Capita + Population + Surface_area, data = data)

# Display regression results as a table
kable(tidy(lm_model), caption = "Linear Regression Results")
Linear Regression Results

term                                     estimate    std.error  statistic  p.value
(Intercept)                              64.0424455  1.5245053  42.008673  0.0000000
Adjusted_Net_National_Income_Per_Capita  0.0009431   0.0000744  12.679026  0.0000000
Population                               0.0000000   0.0000000  3.280748   0.0011029
Surface_area                             0.0000013   0.0000003  4.474462   0.0000094
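
The Population row prints as 0.0000000 because the coefficient is expressed per additional person and is smaller than the seven decimals shown, not because it is exactly zero. The sketch below uses simulated data (not the report's dataset; the variable names and the assumed relationship are made up) to show that measuring a predictor in millions rescales its slope by 1e6 without changing the model.

```r
set.seed(1)  # reproducible simulated data (not the report's dataset)

n <- 200
population <- runif(n, 1e5, 1.4e9)                   # persons
access <- 60 + 2e-8 * population + rnorm(n, sd = 5)  # made-up relationship

# Fit with population in persons, then in millions of persons
fit_raw <- lm(access ~ population)
population_m <- population / 1e6
fit_m <- lm(access ~ population_m)

b_raw <- coef(fit_raw)[["population"]]
b_m   <- coef(fit_m)[["population_m"]]

# At 7 decimals the per-person slope rounds to 0, the per-million slope does not
print(round(c(per_person = b_raw, per_million = b_m), 7))

# Same model, different units: the slope scales by 1e6
print(all.equal(b_m, b_raw * 1e6))
```

The same idea could be applied to the report's model by dividing Population by 1e6 (and, if desired, income by 1,000) before fitting, which makes the table readable while leaving the t statistics and p-values unchanged.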

Conclusions

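  • Access to electricity generally increases with higher income per capita. In the regression, each additional 1,000 USD of income per capita is associated with about 0.94 percentage points more access, holding population and surface area constant.
  • Population and surface area have positive, statistically significant coefficients (p ≈ 0.001 and p < 0.001), but their per-unit effects are very small. Countries with larger populations or greater land areas still show varying levels of electricity access, suggesting additional influencing factors.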